ABSTRACT

Cloud-type classification based on multispectral satellite imagery data has been widely researched and
demonstrated to be useful for distinguishing a variety of classes using a wide range of methods. The research
described here is a comparison of the classifier output from two very different algorithms applied to
Geostationary Operational Environmental Satellite (GOES) data over the course of one year. The first algorithm
employs spectral channel thresholding and additional physically based tests. The second algorithm was
developed through a supervised learning method with characteristic features of expertly labeled image
samples used as training data for a 1-nearest-neighbor classification. The latter’s ability to identify classes is
also based in physics, but those relationships are embedded implicitly within the algorithm. A pixel-to-pixel
comparison analysis was done for hourly daytime scenes within a region in the northeastern Pacific Ocean.
Considerable agreement was found in this analysis, with many of the mismatches or disagreements providing
insight into the strengths and limitations of each classifier. Depending upon user needs, a rule-based or other
postprocessing system that combines the output from the two algorithms could provide the most reliable
cloud-type classification.


1.  Introduction 

Automated cloud-type classification in satellite imagery is a valuable resource for both the
operational and research communities. In an instantaneous sense, knowledge of cloud types in a
given scene improves the retrieval of cloud parameters (e.g., by providing a priori information on
liquid/ice/mixed phase). When analyzed over a long time period, this knowledge contributes to the
analysis of radiation and heat budgets, which are impacted differently depending upon the cloud
types (Li et al. 2007). Owing to the importance of clouds in climate feedback processes, an
improved understanding of cloud-type distribution and its change over time would benefit climate
research (Wang and Sassen 2001). Important operational uses of cloud-type classification include
the identification of convective clouds over oceanic regions, where observational data are sparse
(Donovan et al. 2008). In addition to the identification of convective clouds, diagnosing areas of
fog/stratus and supercooled liquid clouds would positively impact aviation route planning.

To determine the cloud type of a pixel or group of pixels in satellite imagery, an appropriate
classification algorithm must be selected. Algorithm choice is driven in part by the cloud types of
interest and the intended use(s) of the output. These algorithms can, in general, be grouped into
theoretical/physical (explicit physics, hereinafter referred to as EP) or empirical/statistical
(implicit physics, hereinafter referred to as IP) methods. Recent research in the related areas of
scene classification and cloud property retrievals is summarized in the next section.

Validation of an individual classifier applied to real-time or unlabeled testing data is difficult
given the lack of independent validation data. The objectives of the research described here are to
validate and identify the strengths and limitations of two classifiers, one based on EP methods and
the other based on IP methods, through comparison of their output. Geostationary Operational
Environmental Satellite-11 (GOES-11; 135°W lon) data over a 1-yr period serve as the dataset. This
comparison requires a reconciliation of the various cloud categories defined in each of the
algorithms. Although neither classifier output should be considered to be “truth,” classifier
agreement can enhance the confidence in the output of both classifiers. The purpose of this research
is to document agreements and to explain the disagreements between the two algorithms. Future
research will apply this analysis to the refinement of current algorithms or development of new
ones.

The EP cloud-type algorithm is described in section 3, and the IP algorithm is described in
section 4. Comparison results and analysis are presented in section 5, followed by a summary
discussion in section 6.

2.  Related  research 

Many examples of both EP and IP approaches, applied to various sensor data for a variety of
classification problems, can be found in the research literature. The EP algorithms relate the
spectral and spatial contrasts observed in multispectral imagery to characteristics of various cloud
types. For example, the spatial variation in visible reflectance or infrared brightness temperature
provides textural information for differentiating stratus and stratocumulus cloud fields. Absolute
thresholds in temperature or reflectance provide information on the height and/or opacity of the
cloud that helps to relate it to a classification. Differential optical properties can also be
exploited, such as the “split window” difference for detecting thin cirrus clouds. Coupling that
information with visible reflectance provides a means to detect thin cirrus overlapping lower-level
cloud (Heidinger and Pavolonis 2005). Still other channel combinations enable the distinction among
liquid, ice, and supercooled or mixed phase cloud-top conditions. These and other techniques as
applied to EP cloud masking and typing are described by Pavolonis and Heidinger (2004).

Research has also been done on testing the capability of the Visible/Infrared Imager Radiometer
Suite (VIIRS) cloud mask algorithm as applied to Moderate Resolution Imaging Spectroradiometer
(MODIS) data (Hutchison et al. 2005). The MODIS cloud mask is discussed in Ackerman et al. (2008).
Using a grouped threshold approach in addition to the application of radiative transfer modeling,
cloud detection and classification algorithms were developed for Advanced Very High Resolution
Radiometer (AVHRR) data in Dybbroe et al. (2005). Earlier work applying a grouped threshold method,
with the aid of radiative transfer calculations, to AVHRR scene identification is discussed by Baum
and Trepte (1999). Wang and Sassen (2001) describe a physically based algorithm developed and
applied to ground-based remote sensors. Cloud type and macrophysical cloud properties were
identified. Directly related to this research, cloud property retrieval algorithms from GOES data
are described in Mitrescu et al. (2006).

Machine learning techniques have been applied to various image classification problems for many
years. A majority of these classification tasks are approached as supervised learning problems, in
which previously classified samples from a historical dataset are represented by a characteristic
feature vector and serve as a set of training samples. Mazzoni et al. (2007) apply a supervised
learning technique in the form of support vector machines to scene classification within Multiangle
Imaging Spectroradiometer (MISR) data. Baldwin et al. (2005) apply a nearest-neighbor algorithm to
the classification of rainfall systems in radar data. The training sets were class labeled using
cluster analysis, an unsupervised learning method, as opposed to being manually labeled. Parikh
et al. (1997) apply neural networks, genetic algorithms, and statistical methods to the recognition
and tracking of midlatitude cloud systems in cloud-top pressure datasets. Baum et al. (1997) use
labeled AVHRR samples to train a fuzzy logic cloud classifier.

Classification and retrieval schemes have also been developed using a combination of EP and IP.
Oceanic convective cloud diagnoses are performed using a fusion of output from a 1-nearest-neighbor
cloud classifier (Bankert and Wade 2007) and a thresholding technique for deep convection (Schmetz
et al. 1997; Mosher 2002) to produce an enhanced product (Donovan et al. 2008). Li et al. (2007)
compare different satellite sensors (present and future) in cloud classification using the MODIS
cloud mask as the initial classification for a maximum likelihood classification procedure
(unsupervised learning). Seemann et al. (2003) apply a statistical retrieval algorithm in
combination with a nonlinear physical retrieval algorithm to MODIS data in the retrieval of
atmospheric temperature and moisture distribution, total column ozone, and surface skin temperature.
Fouilloux and Iaquinta (1998) present an AVHRR cloud classification algorithm based on physical and
textural properties used in combination with neural networks for the extraction of cloud optical
thickness and droplet effective radius.

Table 1. Cloud type outputs given by the “explicit physics” classification algorithm.

Clear (Clr)
Partly cloudy
Liquid water (Liq)
Supercooled water or mixed phase (Mix)
Glaciated-opaque ice (Glac)
Cirrus (Ci)
Cloud overlap (OL)

3.  Cloud-type  algorithm — Explicit  physics 

Based on the research described in Pavolonis et al. (2005), Pavolonis and Heidinger (2004), and
Mitrescu et al. (2006), an EP algorithm to determine the cloud type of a given cloudy pixel in GOES
imagery is developed. This EP classifier provides a front-end filter for locating and identifying
the appropriate cloud-top-properties retrieval algorithm. Using a series of thresholding tests on
all five GOES imager channels, each pixel is assigned one of the cloud types listed in Table 1.
Pixels classified as partly cloudy are ignored for this study. The algorithm was incorporated into
the Naval Research Laboratory (NRL) automated processing system and coupled to the Navy Operational
Global Atmospheric Prediction System for auxiliary data requirements.

Using a cloud mask algorithm (Heidinger 2004) that includes spatial uniformity information, a pixel
is first determined to be clear (no cloud), partly cloudy, or cloudy. For cloudy pixels, a test for
semitransparent ice cloud overlapping liquid water droplet cloud (Pavolonis and Heidinger 2004) is
performed. This test checks the behavior of the visible channel reflectance and the 11- and 12-μm
brightness temperature difference (split window). A cirrus (Ci) test for transmissive (optically
thin) ice clouds, as described in Pavolonis et al. (2005), is also applied. Both the overlapping
test and thin cirrus test are designed to minimize the false alarms of each type; therefore, one
would expect occasional misclassifications when Ci or overlapping clouds are present. If both of
these tests fail, the appropriate cloud phase tests (as determined by the 11-μm channel brightness
temperature) are applied and the pixel’s cloud type is assigned: 1) liquid water (brightness
temperature greater than 273 K); 2) supercooled water or mixed phase (composed entirely of
supercooled water droplets or of both ice and supercooled water); and 3) glaciated (optically thick
ice) clouds (entirely ice crystals or glaciated tops, e.g., deep convection).
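The order of tests described above can be sketched as a simple decision cascade. The following is a minimal illustration only, not the operational EP code: the `Pixel` container, the function name, the relative order of the overlap and cirrus tests, and the 233-K lower bound separating mixed phase from glaciated cloud are assumptions made for the example. Only the 273-K liquid water threshold is taken from the text, and the actual phase tests use more information than a single brightness temperature.

```python
from dataclasses import dataclass

LIQUID_MIN_BT11_K = 273.0     # from the text: liquid water if warmer than 273 K
GLACIATED_MAX_BT11_K = 233.0  # ASSUMPTION: illustrative value, not given in the paper


@dataclass
class Pixel:
    """Hypothetical container for the inputs the cascade needs."""
    mask: str           # "clear", "partly_cloudy", or "cloudy" (cloud mask result)
    overlap_test: bool  # True if the ice-over-liquid overlap test passes
    cirrus_test: bool   # True if the thin cirrus test passes
    bt11_k: float       # 11-micrometer brightness temperature (K)


def ep_cloud_type(p: Pixel) -> str:
    """Return a Table 1 cloud type label for one pixel."""
    if p.mask == "clear":
        return "Clr"
    if p.mask == "partly_cloudy":
        return "Partly cloudy"
    # Cloudy pixel: overlap and thin cirrus tests come first
    # (their relative order here is an assumption).
    if p.overlap_test:
        return "OL"
    if p.cirrus_test:
        return "Ci"
    # Both tests failed: fall through to the phase tests.
    if p.bt11_k > LIQUID_MIN_BT11_K:
        return "Liq"
    if p.bt11_k > GLACIATED_MAX_BT11_K:
        return "Mix"
    return "Glac"
```

Because the overlap and cirrus tests are tuned to limit false alarms, a pixel that actually contains thin cirrus can fall through to the phase tests in this cascade, which is the behavior discussed in section 5.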


Table 2. Classes used in cloud classifier (implicit physics).

Stratus (St)
Stratocumulus (Sc)
Cumulus (Cu)
Altocumulus (Ac)
Altostratus (As)
Cirrus (Ci)
Cirrocumulus (Cc)
Cirrostratus (Cs)
Cumulus congestus (CuC)
Cumulonimbus (Cb)
CsAn: Cs near turret in thunderstorm; more closely related to deep convection than “garden variety” Cs
Clear (Clr)
Ground snow (Sn)
Haze (Hz)
Sun glint (Sg)


4.  GOES  cloud  classifier — Implicit  physics 

Using a supervised learning method that was first applied to AVHRR data (Tag et al. 2000), an IP
cloud classifier has been developed and further refined for application to GOES data (Bankert and
Wade 2007). A training dataset is established through independent expert agreement of thousands of
labeled 16 × 16 pixel samples. The classes used by the experts are listed in Table 2. In addition
to the imagery, the experts had synoptic weather charts and other data available to assist in the
class assignment of each training sample. General cloud identification is one use of the classifier,
but more specific or application-driven uses are possible. The IP cloud classifier is currently
being used for an oceanic convective diagnosis and nowcasting system (Kessinger et al. 2009). Other
potential uses include snow/cloud delineation and low-cloud or thin cirrus detection, depending upon
any given user’s specific needs or application.

Each expert-labeled training set sample is represented by a vector of characteristic features
computed or extracted from each spectral channel with a final subset of features (Bankert and Wade
2007) chosen through a feature selection routine (Bankert and Aha 1996). Various training sets were
established, differentiated by satellite (GOES-East or GOES-West), sea or land, and daytime or
nighttime scenes. For this research, GOES-West land day and GOES-West sea day training sets are used
within a 1-nearest-neighbor classifier. Daytime observations offer an increased number of classes
because of the availability of visible-band data. The minimum distance in feature space between an
unclassified sample presented to the classifier and the training data samples is found, and the
class label of the nearest-neighbor training sample is subsequently assigned to each pixel in the
unclassified sample.
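As an illustration of the 1-nearest-neighbor step, the sketch below returns the label of the closest training sample in feature space. It is a minimal example under stated assumptions: the function name and the two-element feature vectors are hypothetical, and Euclidean distance is assumed because the distance metric is not specified here. The actual feature set is the one selected in Bankert and Wade (2007).

```python
import math


def nearest_neighbor_label(sample, train_features, train_labels):
    """Return the class label of the training sample closest to `sample`.

    Distance is Euclidean (an assumption for this sketch).
    """
    best_label, best_dist = None, math.inf
    for features, label in zip(train_features, train_labels):
        dist = math.dist(sample, features)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical two-feature training set (values are made up for illustration).
train_features = [(0.10, 0.90), (0.80, 0.20), (0.50, 0.50)]
train_labels = ["St", "Cb", "Ac"]
label = nearest_neighbor_label((0.15, 0.85), train_features, train_labels)
```

In the classifier described above, the returned label would then be assigned to every pixel of the 16 × 16 sample box.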

Fig. 1. GOES-11 visible image depicting the region used for the comparison study.


As described in Tag et al. (2000), classifications of overlapping boxes (a 16 × 16 pixel window is
applied every 8 pixels) within each image are performed. Each image pixel in a given 16 × 16 pixel
box is assigned the same class, resulting in every pixel being classified four times (excluding
image edges). The majority class is assigned (ties broken randomly) to each pixel, followed by a
postprocessing routine that applies conservative measures to check the classification validity of
each pixel. The use of texture measures within the feature set and the use of overlapping boxes help
to overcome some of the limitations associated with pixel-based classifications and provide a more
robust classification. Given the class types used in this IP algorithm, single pixel spectral
information alone would not have provided satisfactory results. Because each box is assigned a
specific class, no “multiple,” “overlapping,” or “unknown” class is used. For this analysis, pixels
classified as ground snow, haze, and sun glint are ignored.
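The overlapping-box vote can be sketched as follows. This is a minimal illustration rather than the operational routine: `classify_box` is a hypothetical callback standing in for the box-level nearest-neighbor classifier, the seeded random generator is an assumption made so the example is repeatable, and the postprocessing validity checks are omitted.

```python
import random
from collections import Counter

BOX = 16   # box size in pixels (from the text)
STEP = 8   # window stride in pixels (from the text)


def vote_pixel_classes(height, width, classify_box, rng=None):
    """Assign each pixel the majority label of the boxes that cover it.

    classify_box(top, left) -> label is a hypothetical stand-in for the
    box-level classifier. Ties are broken randomly, as in the text.
    """
    rng = rng or random.Random(0)
    votes = [[[] for _ in range(width)] for _ in range(height)]
    for top in range(0, height - BOX + 1, STEP):
        for left in range(0, width - BOX + 1, STEP):
            label = classify_box(top, left)
            for r in range(top, top + BOX):
                for c in range(left, left + BOX):
                    votes[r][c].append(label)
    result = []
    for row in votes:
        out_row = []
        for labels in row:
            if not labels:          # pixel not covered by any full box
                out_row.append(None)
                continue
            counts = Counter(labels)
            top_count = max(counts.values())
            tied = sorted(lab for lab, n in counts.items() if n == top_count)
            out_row.append(rng.choice(tied))
        result.append(out_row)
    return result


# Toy 32 x 32 scene: boxes starting at column 0 are "Ci", all others "Sc".
classes = vote_pixel_classes(32, 32, lambda top, left: "Ci" if left == 0 else "Sc")
```

With a 16-pixel box and an 8-pixel stride, an interior pixel is covered by two box positions in each direction, which gives the four classifications per pixel noted above.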

5.  Algorithm  comparison  analysis 

The EP and IP algorithms were applied to hourly daytime GOES-11 data for a 1-yr period (from October
2006 to October 2007) in the northeastern Pacific Ocean (Fig. 1). Both algorithms define a pixel as
daytime if the solar zenith angle is below a specified threshold. To simplify the pixel comparison
between the two classifiers, the IP classes are combined to form a set of classes that match the EP
cloud classes. This clustering of classes is summarized in Table 3. Note that no overlapping cloud
class is possible with the IP algorithm. Because the method and criteria used to define the classes
for each classifier are different, these clusters are not perfect matches for all pixels. However,
analysis of the comparisons should provide an indication as to whether disagreements are a result of
algorithm limitations, class definitions, or a combination of the two.
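The clustering in Table 3 amounts to a lookup from each IP class to the EP cloud type it is compared against. A minimal sketch follows; the dictionary and function names are hypothetical, the abbreviations are those of Tables 1 and 2, and returning `None` for the ignored classes is a choice made for the example.

```python
# IP class -> EP-equivalent cluster, following Table 3.
IP_TO_EP = {
    "St": "Liq", "Sc": "Liq", "Cu": "Liq",
    "Ac": "Mix", "As": "Mix", "CuC": "Mix",
    "Cc": "Glac", "Cs": "Glac", "Cb": "Glac", "CsAn": "Glac",
    "Clr": "Clr",   # not combined
    "Ci": "Ci",     # not combined
}

# IP classes excluded from this analysis (ground snow, haze, sun glint).
IGNORED_IP = {"Sn", "Hz", "Sg"}


def cluster_ip_class(ip_class):
    """Map an IP class to its comparison cluster, or None if it is ignored."""
    if ip_class in IGNORED_IP:
        return None
    return IP_TO_EP[ip_class]
```

No IP class maps to OL, so any pixel that EP labels as overlap is necessarily a disagreement in the full comparison.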

More than 1.4 billion classified pixel pairs (from 4295 GOES-11 images) were compared over the
year-long test period. A percentage distribution of all possible combinations for classifications
within the two classifiers is presented in Fig. 2a. Most of the higher percentages (greater than 2%
of the total pixels) of mismatches or disagreements occurred with pixels classified by the EP
algorithm as either cirrus (Ci) or overlap (OL). The lack of an OL class in the IP classifier is an
obvious explanation for this part of the distribution. A higher total percentage of Ci samples in
the EP algorithm classifications (14.9%), relative to the IP classifier (7.3%), is a result of
limitations in the IP algorithm and will be discussed later. To get a more direct comparison of the
“cloud climatology” for this region produced by each algorithm, all pixels classified as OL by EP
are removed from the distribution matrix to produce the same set of classes for each algorithm and
the probability distribution is recomputed. These results are presented in Fig. 2b, with total
percentages for each class shown in Fig. 3. The IP classifier has a higher preference for liquid and
glaciated clouds, whereas the EP algorithm prefers mixed (supercooled water droplets) and Ci clouds
as compared with the IP classifier. The discrepancy in the number of Ci samples between the two
algorithms stands out.

Table 3. The IP class clusters used for comparisons with EP cloud types.

Liquid water
    Stratus (St)
    Stratocumulus (Sc)
    Cumulus (Cu)
Mixed phase/supercooled water
    Altocumulus (Ac)
    Altostratus (As)
    Cumulus congestus (CuC)
Glaciated
    Cirrocumulus (Cc)
    Cirrostratus (Cs)
    Cumulonimbus (Cb)
    CsAn
Clear (Clr): not combined
Cirrus (Ci): not combined

Fig. 2. (a) Percent frequency of occurrence, over entire dataset, of each possible pixel
classification combination for the two algorithms (EP and IP). (b) As in (a) after removing all OL
samples.

Fig. 3. Percentage distribution of total pixels for the classes within each classifier,
disregarding OL pixels.

Other summaries of the pixel-to-pixel comparisons over the entire dataset can be found in Figs. 4a
and 4b. The percent distributions of all pixels over the entire year within a specific EP cloud
class and the coincident IP cloud class (as defined in Table 3) are displayed in Fig. 4a. For
example, 57.2% of the pixels classified as mixed phase or supercooled water by the EP algorithm were
classified as one of the liquid IP cloud classes. Figure 4b is a graph of the percent distributions
within an IP class and the coincident EP cloud type.
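The two kinds of summary used here, a joint distribution over all pixel pairs (as in Fig. 2) and a distribution conditioned on the EP type (as in Fig. 4a), can be computed from the same list of pixel pairs. The sketch below is illustrative only: the function names are hypothetical and the toy counts are made up, not taken from the dataset.

```python
from collections import Counter


def joint_percent(pairs):
    """Percent of all pixel pairs falling in each (EP, IP) combination."""
    counts = Counter(pairs)
    total = sum(counts.values())
    return {pair: 100.0 * n / total for pair, n in counts.items()}


def percent_within_ep(pairs):
    """Percent of each EP type's pixels assigned to each IP cluster.

    Values for a given EP type sum to 100, as in a column of Fig. 4a.
    """
    counts = Counter(pairs)
    ep_totals = Counter(ep for ep, _ in pairs)
    return {(ep, ip): 100.0 * n / ep_totals[ep] for (ep, ip), n in counts.items()}


# Toy (EP, IP) pairs, invented for illustration.
pairs = [("Mix", "Liq")] * 3 + [("Mix", "Mix")] * 1 + [("Liq", "Liq")] * 4
joint = joint_percent(pairs)
within_ep = percent_within_ep(pairs)
```

Swapping the roles of the two labels in `percent_within_ep` gives the distribution conditioned on the IP class, as in Fig. 4b.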

Fig. 4. (a) Percent distribution of pixels within each EP cloud type as classified by the IP
classifier; EP axis columns sum to ~100%. (b) Percent distribution of pixels within each IP cloud
class as classified by the EP classifier; IP axis columns sum to ~100%.

As evident in Fig. 4, a considerable amount of agreement exists between the EP and IP classifiers.
A significant exception is the high percentage of pixels classified as supercooled (or mixed) by the
EP algorithm and as one of the liquid water cloud classes by the IP algorithm. A discussion for that
specific disagreement follows below. The most notable agreements, with the highest percentages, are
within the clear (no cloud) and liquid water cloud classes, which also have the highest frequency of
occurrence within the entire dataset (Fig. 2). Although classifier agreement increases confidence in
each classifier output, some of the disagreements are a result of the different original class
compositions and how those classes were defined. Therefore, neither classifier may actually be in
error for certain situations. A further analysis of the results, along with knowledge of each
algorithm’s strengths and limitations, led to the following observations for specific notable
disagreements:

1) IP = liquid (St, Sc, Cu) and EP = supercooled (or mixed): Greater than one-half of the EP
supercooled pixels were paired with IP liquid cloud classifications. Some cases reflect a known
bias of EP toward higher supercooled water or mixed phase type. The frequent occurrence of clouds
generated in cold air masses that pass over the Pacific results in most of these mismatches.
Figure 5 is presented as an example. The cloud tops are too cold to meet the threshold for liquid
clouds in the EP algorithm (Fig. 5b). Physics allows for these clouds to be composed of liquid
water droplets, but in a supercooled state. The training set of the IP classifier (Fig. 5a)
contains similar cold-air cloud samples that were classified as Cu and Sc by the experts because
of the estimated cloud height. Therefore, based on the class definitions for each algorithm, no
misclassification occurs in either classifier. Similar to this situation are pixels classified as
“mixed” by IP and “glaciated” by EP. These pixels could be actual midlevel atmosphere clouds
(defined as As and Ac within the IP classifier) with glaciated or very cold tops. Less frequently,
this situation could be a result of very thin Ci overlapping a low cloud deck (i.e., the OL cloud
type). The thin Ci signal is missed by the IP classifier and the test for OL cloud type fails in
the EP algorithm, most likely because of the split-window threshold not being exceeded. The pixel
is then classified by EP as supercooled water.

2) IP = mixed (As or Ac) and EP = overlap: The IP algorithm does not output an OL class; therefore,
actual OL pixels are being classified as As or Ac (Fig. 5a; white oval) with signals from both
low cloud and overlying Ci being used to give a mixed phase classification as found in the
nearest-neighbor training data. In Fig. 5 (white oval), thin high clouds are streaming over the
low clouds associated with the front.

3) IP = liquid and EP = Ci: Two aspects of the IP algorithm negatively affect its ability to
classify Ci correctly. Either the classification assignment method (in which all pixels in a given
box are classified as the same class) or the postprocessing check of IR brightness temperature for
initially classified Ci samples (performed to lower the number of Ci false alarms) leads to a
misclassification by the IP classifier. Also, these could be OL pixels that are missed by both
methods (see Fig. 6: area enclosed by black oval). Here the OL test fails in the EP algorithm but
the Ci test confirms the presence of thin high clouds (e.g., Fig. 6b). For this type of
disagreement, based on either reasoning, one would expect Ci to be present in the pixel.

4) IP = mixed and EP = Ci: Examples found in the dataset imply that both classifiers are missing OL
type for this situation. The IP algorithm is getting signals from both types (e.g., Fig. 6a; white
oval) and classifying the pixel as mixed phase (As or Ac), and the OL test fails and the Ci test
passes in the EP algorithm (e.g., Fig. 6b; white oval).

5) IP = Ci and EP = supercooled (or mixed): Although not nearly as prevalent as the opposite
situation (observation type 4), Ci (or OL type) is missed by the EP classifier but Ci is detected
by the IP classifier. Actual OL could also be misclassified as Ci by the IP classifier. Figure 7a
(area within black oval) provides an example in which an overlapping cloud situation is
misclassified as Ci by IP (which has no OL class) and the same area in Fig. 7b is classified as
supercooled water or mixed phase by EP. Again, regardless of whether the actual classification is
Ci or OL, thin high clouds are known to be present in the pixel.

Fig. 5. Example case (1700 UTC 16 Apr 2007) of (a) IP classification of low clouds (St, Sc, and Cu;
blue colors) and (b) EP classification of supercooled (mixed) clouds (green) for the same pixels,
mainly in cold air behind the front. (c) GOES-11 visible image and (d) IR image are also provided
for reference.

6) IP = clear and EP = liquid: The IP algorithm can miss thin low-cloud pixels near the terminator
(high solar zenith angle) resulting in a clear (no cloud) classification. In some instances, the
IP postprocessing check (for the minimum visible channel albedo threshold for cloud detection) can
change an original liquid cloud class to clear or there could be low thin clouds misclassified as
clear (e.g., Fig. 8a; black oval).

7) IP = Glac and EP = Ci or OL: These pixel pair classification outputs are most likely the result
of class definitions (particularly with regard to optical thickness for Ci), lack of OL class in
IP, and classifier design rather than misclassifications. Of interest is that more pixels
classified by the IP algorithm as cirrostratus and cirrocumulus were paired with EP classification
of OL than Glac, whereas pixels classified by IP as cumulonimbus and CsAn (see Table 2) had a
higher frequency pairing with Glac than OL. These distributions are indicative of the optical
thickness of the clouds (e.g., manifesting in magnitude of visible reflectance) as used indirectly
in the class definitions.

6.  Discussion 

In the ideal case, the validation of a cloud classification algorithm would involve a truth dataset
of cloud types to match against the classifier output. The truth dataset could be constructed from a
satellite interpretation expert’s analysis and/or through further consensus with more specialized
satellite observations [e.g., active sensors such as CloudSat and Cloud-Aerosol Lidar and Infrared
Pathfinder Satellite Observation (CALIPSO)]. Although developing truth data was not practical and no
other observational datasets were available for validating the classifiers in the current study,
comparing the output of each classifier with knowledge of their respective strengths and limitations
leads to a more complete understanding of performance. Future efforts, especially those with a more
limited time period and areal coverage, connected with field campaign observations will provide
opportunities along these lines.

Fig. 6. The 1900 UTC 27 Mar 2007 (a) IP classification, (b) EP classification with (c) GOES-11
visible channel and (d) IR channel images.

Many of the classifier disagreements, as noted in this study, are a result of the lack of an OL
cloud class in the IP algorithm and/or missed OL cloud types in the EP algorithm. By enlisting
active sensor data (CloudSat and CALIPSO), adding OL samples to IP training data, and, if necessary,
developing new discriminating features, it is possible to improve the IP classifier performance in
these situations. Alternatively, a postprocessing check on specific pixels (e.g., As and Ac pixels)
to determine if an overlapping cloud situation exists could also be implemented. For the EP
algorithm, an adjustment to the OL test, which is designed to minimize false alarms, would lower the
frequency of misses by this classifier. Again, such adjustments may enlist other observing systems
such as CloudSat, CALIPSO, or the 1.38-μm band on MODIS. These potential improvement examples to
each algorithm were only revealed through the current comparisons.

A problem with both classifiers, in particular the IP classifier, is the misclassification of actual
Ci pixels because both algorithms are designed conservatively to minimize the number of false alarms
of Ci. Slight modifications could be made to either or both Ci tests (postprocessing part of the IP
algorithm) to lower the number of misclassifications. Modifications should be applied such that
false alarms are not significantly increased.

Fig. 7. Example case (2000 UTC 27 Feb 2007) of (a) IP classification of Ci and (b) EP classification
of mixed (supercooled) clouds for the same pixels. (c) GOES-11 visible image and (d) IR image are
also provided for reference.


Development of a hybrid cloud classification algorithm based on both the IP and EP algorithms would
also overcome the OL and Ci problems in addition to other limitations. This “classification
adjustment” algorithm could take the form of a rule set applied to each pixel. Pixels that have
agreement in class label would be unchanged, with specific rules applied to those pixels in
disagreement. For example, if IP assigns a mixed-phase class and EP is OL, the pixel is classified
as OL (rule determined through this comparison research). Another example would be that if a pixel
is assigned a mixed phase class from the IP algorithm and Ci from the EP algorithm the pixel is
given a final classification of OL. Because of the possibility of more than one explanation for a
specific disagreement, other rules or threshold checks would be necessary. User needs and knowledge
depth of classifier limitations would ultimately aid in determining whether just one of the
individual classifiers was sufficient or whether a customized combination of the two algorithms
should be applied.
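A rule set of this kind could be sketched as a small function applied to each pixel pair. Only the agreement rule and the two example rules given above come from the text. The function name and the choice to return `None` for every other disagreement (marking it as unresolved pending further rules or threshold checks) are assumptions made for this illustration.

```python
def adjust_classification(ip_cluster, ep_type):
    """Combine the IP cluster label (Table 3) and the EP type for one pixel.

    Returns the final label, or None when no rule in this sketch applies.
    """
    # Agreement: the label is left unchanged.
    if ip_cluster == ep_type:
        return ep_type
    # Example rules from the text: IP mixed phase with EP overlap or EP cirrus
    # is treated as an overlapping cloud situation.
    if ip_cluster == "Mix" and ep_type in ("OL", "Ci"):
        return "OL"
    # ASSUMPTION: every other disagreement is left unresolved here; additional
    # rules or threshold checks would be needed in practice.
    return None
```

For example, `adjust_classification("Mix", "Ci")` yields `"OL"`, whereas a liquid/supercooled disagreement is returned as unresolved because more than one explanation is possible.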

Fig. 8. Example case (1600 UTC 6 May 2007) of (a) IP classification of Clr, (b) EP classification of
liquid clouds for the same pixels, and the (c) associated GOES-11 visible image.


away prior to the completion of this manuscript. GOES data were acquired through a processing
procedure at Code 7541 of the Naval Research Laboratory.